Appids & Fingerprints

xkeyscore@nsa

TOP SECRET//COMINT//REL TO USA, FVEY

• Syntax is similar to C:

function('name', level, <optional info>) = 'search terms and patterns';

• Two main functions - appid and fingerprint:

appid('chat/icq', 8.5, wireshark='icq', chatproc='ICQ') =
    /[^o]icq/c and $icq;

fingerprint('fingerprint/phone/nokia/generic') =
    'user-agent: nokia' or
    'profile: http://nds.nokia.com/uaprof/n

Appids are named using a pseudo directory convention:

/application type/subtype/name

Levels are 1.0 - 9.9 with lower numbers meaning higher priority.

This allows multiple signatures to match a piece of traffic, but
only the most specific appid will be applied. For example:

appid('chat', 9.9) = ...
appid('chat/yahoo', 9.8) = ...
appid('chat/yahoo/incoming', 9.7) = ...

If a session matches all three signatures, the appid will be
'chat/yahoo/incoming' since it has the lowest level (highest priority).
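
The priority rule can be sketched in a few lines of Python. This is an illustration only, not the system's implementation: the `pick_appid` helper and the (name, level) pair format are hypothetical; only the "lowest level wins" rule comes from the slide.

```python
def pick_appid(matches):
    """Return the name of the matching signature with the lowest level.

    `matches` is a hypothetical list of (appid_name, level) pairs, one for
    every appid signature that hit on a single session.
    """
    if not matches:
        return None
    # Lower level number means higher priority.
    name, _level = min(matches, key=lambda match: match[1])
    return name


matches = [("chat", 9.9), ("chat/yahoo", 9.8), ("chat/yahoo/incoming", 9.7)]
print(pick_appid(matches))
```

With the three example signatures, the most specific name is selected because it carries the lowest level number.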

Third parameter is the application type; if missing, we use
the appid name up to the first slash as the type

appid('http/response', 9.2, 'web') = ...
appid('chat/yahoo/incoming', 9.1) = ...
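
The defaulting rule for the application type, as a minimal Python sketch. The `application_type` helper is hypothetical and only mirrors the rule stated above.

```python
def application_type(appid_name, explicit_type=None):
    """Return the explicit type if given, else the name up to the first slash."""
    if explicit_type is not None:
        return explicit_type
    return appid_name.split("/", 1)[0]


print(application_type("http/response", "web"))
print(application_type("chat/yahoo/incoming"))
```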

c Search Patterns

XKEYSCORE supports Boolean operations and regular expressions

Raw text must be enclosed in single quotes

• 'search term'

Terms can be combined with Boolean logic

• 'search term' and 'another term' and not 'defeat term'
• 'search term' or 'another term'

appid('voip/sip/IMS', 6.0, wireshark='sip') =
    ('via: sip' or 'v: sip') and 'cseq:' and (
        'p-access-network-info:' or
        'p-called-party-id:' or
        'p-charging-vector:' or
        'p-charging-vector-addresses:' or
        'p-media-authorization:' or
        'security-verity:' or
        'proxy-authorization:' and 'scscf' or
        'path:' and 'pcscf' or
        'path:' and 'scscf

ex Pattern

Binary patterns can be represented by putting a \x in front of
each value:

'\xff\xff\x00\x02'

Or use the hex function:

hex('ff:ff:00:02')

Use slashes to enclose regular expressions:

/[^a-zA-Z0-9]BTE/
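
The two binary notations describe the same bytes. A minimal Python sketch, assuming the colon-separated form is a list of plain hex octets; `hex_pattern` is a hypothetical stand-in for the hex function, not its real implementation.

```python
def hex_pattern(text):
    """Turn a colon-separated hex string such as 'ff:ff:00:02' into bytes."""
    return bytes.fromhex(text.replace(":", ""))


escaped = b"\xff\xff\x00\x02"          # the \x notation
from_hex = hex_pattern("ff:ff:00:02")  # the hex() notation

print(escaped == from_hex)
```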

Keywords and regular expressions are NOT case sensitive
by default.

Append a 'c' to request case-sensitive evaluation:

'keyword'c
/regex/c
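
The same default can be mimicked with Python's `re` module. This is a sketch under the assumption that the trailing `c` simply switches off case folding; `compile_pattern` is a hypothetical helper.

```python
import re


def compile_pattern(regex, case_sensitive=False):
    """Compile a regex that ignores case unless case_sensitive is set."""
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.compile(regex, flags)


default = compile_pattern(r"[^o]icq")                      # like /[^o]icq/
strict = compile_pattern(r"[^o]icq", case_sensitive=True)  # like /[^o]icq/c

print(bool(default.search("xICQ")), bool(strict.search("xICQ")))
```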

Keywords must be at least 3 characters or they will never
hit. This minimum is increased to 4 at some sites for
performance reasons.

Regular expressions must include a fixed "anchor" meeting
the minimum keyword length.

Bad: /[A-Z]{3}-[0-9]{3,5}/
OK:  /ABC-[0-9]{3,5}/

Each session gets one appid - lowest level wins. It gets
databased in the 'application' field.

All matching fingerprints are stored in the 'fingerprint'
field. Level is ignored and can be omitted from
fingerprint definitions.

[Screenshot: search form with the fields Application Type, Application Info, Application (winning appid) and AppID (+Fingerprints) (winning appid plus all fingerprints).]

appid('mail/yahoo', 9.0) = 'Host: mail.yahoo';
appid('mail/yahoo/login', 8.0) = 'Host: mail.yahoo' and '/login';

fingerprint('mail/arabic') = 'mail' and /language[:=] ?ar/;
fingerprint('mail/yahoo/ymbm') = 'Host: mail.yahoo' and 'YMBM='c;

GET /login.html HTTP/1.1
Referer: http://us.f359.mail.yahoo.com/ym/ShowLetter
Accept-Language: ar
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
Host: mail.yahoo.com
Connection: Keep-Alive
Cookie: B=fn50ehd2612o2&b=3&s=rp; YMBM=d=&v=l;

Application: mail/yahoo/login
Fingerprint: mail/yahoo/login mail/arabic mail/yahoo/ymbm
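
The outcome shown on this slide can be reproduced with a small Python sketch. It is only a model of the rules as written: the `has` and `classify` helpers are hypothetical, keywords are treated as case-insensitive substrings unless marked otherwise, and the fingerprint list is returned separately rather than combined with the winning appid as in the displayed field.

```python
import re

REQUEST = (
    "GET /login.html HTTP/1.1\r\n"
    "Referer: http://us.f359.mail.yahoo.com/ym/ShowLetter\r\n"
    "Accept-Language: ar\r\n"
    "Accept-Encoding: gzip, deflate\r\n"
    "User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)\r\n"
    "Host: mail.yahoo.com\r\n"
    "Connection: Keep-Alive\r\n"
    "Cookie: B=fn50ehd2612o2&b=3&s=rp; YMBM=d=&v=l;\r\n"
)


def has(text, keyword, case_sensitive=False):
    """Substring test, case-insensitive by default."""
    if case_sensitive:
        return keyword in text
    return keyword.lower() in text.lower()


# (name, level, rule) for appids; (name, rule) for fingerprints.
APPIDS = [
    ("mail/yahoo", 9.0, lambda t: has(t, "Host: mail.yahoo")),
    ("mail/yahoo/login", 8.0,
     lambda t: has(t, "Host: mail.yahoo") and has(t, "/login")),
]
FINGERPRINTS = [
    ("mail/arabic",
     lambda t: has(t, "mail")
     and re.search(r"language[:=] ?ar", t, re.IGNORECASE) is not None),
    ("mail/yahoo/ymbm",
     lambda t: has(t, "Host: mail.yahoo") and has(t, "YMBM=", True)),
]


def classify(text):
    """Return (winning appid, list of all matching fingerprints)."""
    hits = [(name, level) for name, level, rule in APPIDS if rule(text)]
    application = min(hits, key=lambda hit: hit[1])[0] if hits else None
    fingerprints = [name for name, rule in FINGERPRINTS if rule(text)]
    return application, fingerprints


application, fingerprints = classify(REQUEST)
print(application, fingerprints)
```

Both appids match the request, but only the one with the lower level is kept, while every matching fingerprint is retained.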

### Begin ASRAR El Mojahedeen v2.0 Encrypted Message ###
r/RgTzT/ATRhN2E1Zjg1 OWQyNWRjMmE22TdlNzZm2DhlODUxZWZhMDQ1 Mj YwMjViZGUO 
ZGYwMjdkMmJmNTA4ZDY2YjkOMGU2NGNiYjg6MzHjZTc6MThjY2 Y1ZmY6MTgzZDIkYjhjMTE 
x0GYzYjc1ZDdiMDAxNTQzZmVINDVIY2YyMGJjYjU20DkyYjdmYjFjYjAzMWM6ZDQ20WFIMzg 
4NThhM2l1Mjc50DkzZGNhOGRmNWJmNjVIZjQOMjMxNDM4MDIyO TglMmRjMGJiNGNkYTN 
kYTQ4MzMxZjRiN2FiNjl3MjE1 NGI3MTA3ZDQ4NWRmYzMyOTUzZ JZIMjg3NjQ1 OGQ4MTA3N 
TU2N2ZkN2ZjYzUzYzYyMjFIODAwN2VkM2U5MTZiNDY2MmM2ZTV IYjQ2YzlOOGQ20DUxNW 
VkMjl2MWViNDAyOGIOMThkMTdhNTY1YzlxMDgyOGZIM2lwZWZj MDgwM2U4MzNINDg1 OD 
UxZTc40Dc1 MTY2M2IONjU5ZjBhZjVhNjkOOTIhNGExOThmYWVI NmFIZjlyNmMwZDA3MDM0 
NjJkZDhhMml4ZmRhYjc3NmZINDFk0DkyYjBhYjY3MDQ10GVIMj dhYmUwZTIyNGIxYmQyZDIz 
ZjliM2E5ZGQ5NmNhZDQxOTM4NTI0Mjc3MzBIOWEwZWE1Njk3Yj gxY2ViNTQ1 OWULnoi/VD 
ULIjTEuDJqneOGMRHesiBPTnZj02yqbmKbFklPjwMhe7FUhFAOw74S+i+PokOREo5XhdP+y9 

/ 

Gul3juYTvrlE0xGx20sSfNS5kfRXXH1DaTnb70yLife9r6mMIQ6 

e6E0SRUIdU6YVupz0hhgddDof 
SBbFR3OvgOS+pUxCYgmEOr/RA+fYi47tuHQMh+dynZqQspNdmRUmkjEpFqF03sPHS/1Cinjqo 
e1GsfB+in52XE2q/WdnU+4XjWnl/isVNAjv2nsL+52TG1IHbgocmpQoxy0B0SXPcRv/+2JekV37 
kl XyONZk9YH+DV3aWYPXt+yrn+wG0XKIT qPHIUI JWAZql2NK/cSXt9DMtCtcb8czRj6G9IXvJ9 
Eny7tD6xPd9BGio9M+3QuUkZHLEmJiAvgvB6R/X/3whBqk6zMHQI_fo+VJcX9umW5mRtgCjzS 
PW6lzzFCGtB4SK4PxT52ZC0B2kWD8VMyNffrlsTG4XUesgx47Nd5xML8w5pj/fZwKNK+EfKIP 
==Z1ow29A9N3uLIXBX62LhOyjf1iqfJ2FNR7AIONS^jwKoggVmkxDiuGaQi+TurpxBgat1g 



### End ASRAR El Mojahedeen v2.0 Encrypted Message ###

• Appids and fingerprints are distributed across the
  XKEYSCORE network every hour
• Changes will take effect within 2 hours of check-in
• Current definitions are available on the website:

http://xkeyscore.rl.nnsa/documents/appid.html

Intermediate Syntax

You can append derived metadata fields onto the end of an
appid:

appid('p2p/kazaa', 7.7, append='mime_type') =
    lpos('x-kazaa') and not $http;

This will result in an appid like 'p2p/kazaa/image/jpeg'.
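
How the appended name is formed can be sketched as follows. The `appended_appid` helper and the metadata dictionary are hypothetical, and leaving the name unchanged when the field is absent is an assumption, not something the slide states.

```python
def appended_appid(base, session_metadata, append=None):
    """Join the base appid and a derived metadata value with a slash."""
    if append is None or append not in session_metadata:
        # Assumption: without the derived field, the base name is kept.
        return base
    return base + "/" + session_metadata[append]


print(appended_appid("p2p/kazaa", {"mime_type": "image/jpeg"}, append="mime_type"))
```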

Built in functions

ip( expr )
    Matches against an IP address; looks in the to address and
    from address in the session header
    • ip( '10.10.10.1' );

toport( expr )
    Matches against the Destination/To port. Note this
    must be a numeric representation of a port.
    • toport( 1920 );

fromport( expr )
    Matches against the Source/From port. Note this must
    be a numeric representation of a port.
    • fromport( 80 );

port( expr )
    Matches against either port. Note this must be a
    numeric representation of a port.
    • port( 6667 );

next_protocol( expr )
    Matches against the integer version of the next
    protocol.
    • next_protocol( 250 );

protocol( 'text' )
    Will only work for IP next protocol names as
    defined in the IANA next protocol numbers
    document
    • protocol( 'tcp' );

email_address(sel)
    permutes just like strong selector (just like
    DECODEORDAIN)

mac_address(addr)
    tasks a MAC address

smac(addr)

dmac(addr)

ip(addr)
    tasks this IP address (either to or from)

from_ip(addr)
    tasks this IP address only when it is the originator

to_ip(addr)
    tasks this IP address only when it is the destination

More built in functions

first(expr)
    Matches against a pattern at the beginning of
    the session

lpos(expr)
    Matches against a pattern at the beginning of
    each line (\n)

pos( expr )
    expression occurs at offset X in the session
    • pos('Hello') == 5
    • pos(/Good.*Grief/) <= 10

between( expr )
    • between('Hello', 'World', 10, 100)
    Separation between 'Hello' and 'World' is
    greater than or equal to 10 bytes and less than
    or equal to 100 bytes
    This is the same as using the following regular
    expression:
    • /Hello.{10,100}World/

'term'c
    Does a case sensitive match of the term

'term'u
    Treats the term as UTF-16
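
The positional functions on this slide can be approximated in Python as below. This is a rough model, not the real matcher: the function bodies are hypothetical, matching is case-sensitive for brevity, `pos` reports the offset of the first match only, and letting `.` span newlines is an assumption.

```python
import re


def first(data, pattern):
    """True if the session starts with the pattern."""
    return data.startswith(pattern)


def lpos(data, pattern):
    """True if any line (split on \\n) starts with the pattern."""
    return any(line.startswith(pattern) for line in data.split(b"\n"))


def pos(data, regex):
    """Offset of the first match of a bytes regex, or None if absent."""
    match = re.search(regex, data, re.DOTALL)
    return match.start() if match else None


def between(data, left, right, min_gap, max_gap):
    """True if left and right are separated by min_gap..max_gap bytes."""
    regex = re.escape(left) + b".{%d,%d}" % (min_gap, max_gap) + re.escape(right)
    return re.search(regex, data, re.DOTALL) is not None


session = b"EHLO example\nTo: a\nHello" + b"x" * 10 + b"World"
print(first(session, b"EHLO"), lpos(session, b"To: "),
      pos(session, rb"Hello"), between(session, b"Hello", b"World", 10, 100))
```

The `between` helper builds the equivalent regular expression named on the slide, so the two forms can be compared directly.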

appid('voip/skinny(port2000)', 9.9, wireshark='skinny') =
    port(2000);

appid('voip/skinny/keep-alive', 3.0, wireshark='skinny') =
    toport(2000) and
    first('\x04\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00');

appid('voip/skinny/keep-alive-ack', 3.0, wireshark='skinny') =
    fromport(2000) and
    first('\x04\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00');

appid('mail/smtp/to_server', 8.5, direction=$from_server,
      wireshark='smtp') =
    toport(25) and
    ( first('helo') or
      first('ehlo') or
      first('data') or
      (lpos('To: 'c) and lpos('From:'c)) or
      lpos('QUIT'c) or
      lpos('mail from:') or
      lpos('rcpt to:') );

You can assign a pattern to a variable (CHAINWORD) and reuse the
variable in many patterns.

$sip = 'via: sip' and 'cseq' and 'SIP/2'c;

Now we can use this variable in future definitions:
appid('voip/sip', 7.2) = $sip;
appid('voip/sip/invite', 6.9) = $sip and 'INVITE';
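
Reusing a named pattern works much like composing predicates. A minimal Python sketch, assuming keywords are substring tests; `keyword` and `all_of` are hypothetical helpers and the sample messages are made up for illustration.

```python
def keyword(term, case_sensitive=False):
    """Build a rule that tests for a substring (case-insensitive by default)."""
    def rule(text):
        if case_sensitive:
            return term in text
        return term.lower() in text.lower()
    return rule


def all_of(*rules):
    """Combine rules with Boolean 'and'."""
    return lambda text: all(rule(text) for rule in rules)


# Defined once, then reused, like the $sip variable.
sip = all_of(keyword("via: sip"), keyword("cseq"),
             keyword("SIP/2", case_sensitive=True))
voip_sip = sip
voip_sip_invite = all_of(sip, keyword("INVITE"))

invite = ("INVITE sip:bob@example.com SIP/2.0\r\n"
          "Via: SIP/2.0/UDP host.example.com\r\nCSeq: 1 INVITE\r\n")
options = ("OPTIONS sip:bob@example.com SIP/2.0\r\n"
           "Via: SIP/2.0/UDP host.example.com\r\nCSeq: 2 OPTIONS\r\n")
print(voip_sip(invite), voip_sip_invite(invite), voip_sip_invite(options))
```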

There are a number of chainwords predefined for convenience:

• $tcp
• $udp
• $icmp
• $sctp
• $rpc
• $arp
• $ssl
• $http_cmd
• $http
• $http_get
• $http_put
• $http_post
• $http_delete
• $http_trace
• $http_head
• $http_options
• $http_partial
• $vbulletin
• $mime_type
• $user_agent

$icq = 'ICQ'c and $http and not (port(80) or $html_body or
       $http_cmd);

appid('chat/icq', 8.5, wireshark='icq', chatproc='ICQ') =
    /[^o]icq/c and $icq;

appid('chat/icq', 9.0, wireshark='icq', chatproc='ICQ') =
    first('icq') and not port(25);

Expressions are evaluated only within a certain context
instead of across the session as a whole.

html_title('Yahoo! Mail' or 'Yahoo! Address Book')
    ... only hits if those keywords are seen within the
    title of a web page

http_host('maps.google.com')
    ... only hits within the "Host:" HTTP header

Why use context-sensitive scanning?

• More intuitive - you can say what you mean
• More accurate - if 'maps.google.com' is mentioned in a
  blog post, you don't want to try processing it as a
  Google Maps session
• Better performance for XKEYSCORE
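
The accuracy point can be illustrated with a short Python sketch that compares a whole-session keyword test with a test restricted to the "Host:" header. The `http_host` function here is a hypothetical stand-in that parses only simple request headers, and both sample requests are invented for the example.

```python
def http_host(request_text, term):
    """True only if the term appears in the value of the Host header."""
    for line in request_text.split("\r\n")[1:]:
        if line == "":
            break  # blank line ends the headers; the body is not examined
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "host":
            return term.lower() in value.lower()
    return False


maps_request = "GET /maps?q=x HTTP/1.1\r\nHost: maps.google.com\r\n\r\n"
blog_post = ("POST /comment HTTP/1.1\r\nHost: blog.example.org\r\n\r\n"
             "I found it on maps.google.com yesterday")

for text in (maps_request, blog_post):
    print("maps.google.com" in text, http_host(text, "maps.google.com"))
```

The plain keyword hits on both inputs, while the context-restricted test hits only on the request actually addressed to that host.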

Sample contexts

• html_title
• url
• http_host
• http_referer
• http_cookie
• http_server
• user_agent
• web_search
• to_cc
• from_cc
• filename
• file_ext
• doc_title
• doc_subject
• doc_author
• doc_org
• doc_hash
• doc_body
• email_body
• chat_body

appid('finance/currency_conversion/generic', 8.0) =
    html_title('currency' and ('exchange' or 'conver')) or
    http_server('currency' or 'x-rates.com');

appid('finance/currency_conversion/xe', 8.0) =
    http_host('xe.com') or
    html_title(/^XE -/c or 'XE.com'c);

appid options:
  --help                      this help message
  --list-all                  list all the application/fingerprint names and
                              levels
  --list-appids               list all the application names (no fingerprints)
  --list-fingerprints         list all the application names (no appids)
  --list-types                list all the application types
  --list-levels               list all the application levels
  --unit-test                 perform unit tests with data in the heirachy
                              'datadir', with files matching 'filespec'
  --quiet                     don't print any load messages
  --appid fname arg           location of appid.cfg
  --input-file arg            input file to test
  --datadir arg               The test data directory. Defaults to
                              $(XSCORE_TEST_DATA_DIR)/appids
  --filespec arg (=.*\.ul24)  A regular expression to match against files to
                              check
  --noexit arg (=0)           do not stop on the first error

appid sample.ul24
Loading appids
-> Loading : /home/oper/xkeyscore/config/dictionaries/appid/appid_definitions.cfg
-> Loading : /home/oper/xkeyscore/config/dictionaries/appid/anonymizer.appid
-> Loading : /home/oper/xkeyscore/config/dictionaries/appid/bulletinboard.appid
-> Loading : /home/oper/xkeyscore/config/dictionaries/appid/tao_vpn.appid
-> Loading : /home/oper/xkeyscore/config/dictionaries/appid/tdmoip.appid
-> Loading : /home/oper/xkeyscore/config/dictionaries/appid/terminal.appid
-> Loading : /home/oper/xkeyscore/config/dictionaries/appid/voip.appid
-> Loading : /home/oper/xkeyscore/config/dictionaries/appid/appid_definitions.cfg
Finished loading appids
Filename: sample.ul24

Appid: encryption/https

Total Size: 19.36Kbits
Total Time: 0.01secs
Rate: 1.936Mbits/s
Overall performance:
Total Time: 0.01secs
Total Bits: 0.01936Mbits
Overall Rate: 1.936Mbits/s

Keywords and regular expressions don't work for
everything

• Looking down columns in packet data
• Checksums
• Decoding (urlencoding, base64, gzip, etc.)
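
The decoding case is easy to demonstrate with the Python standard library. This sketch only shows why a plain keyword search misses encoded content; the search term is arbitrary and nothing here reflects how the system itself decodes data.

```python
import base64
import gzip
from urllib.parse import quote, unquote

term = "search term"
raw = term.encode("ascii")

# The same content under three common encodings.
encoded = {
    "urlencoding": quote(term).encode("ascii"),
    "base64": base64.b64encode(raw),
    "gzip": gzip.compress(raw),
}

decoders = {
    "urlencoding": lambda blob: unquote(blob.decode("ascii")).encode("ascii"),
    "base64": base64.b64decode,
    "gzip": gzip.decompress,
}

for name, blob in encoded.items():
    visible_before = raw in blob                 # keyword search on the wire form
    visible_after = raw in decoders[name](blob)  # keyword search after decoding
    print(name, visible_before, visible_after)
```

A keyword test generally succeeds only after the matching decoder has been applied, which is the kind of step a fixed keyword or regular expression cannot express.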

Basic idea:

1. Preliminary "trigger" using standard keywords
   and regular expressions
2. Secondary test using a snippet of C++ code

Example -- verifying a length field:

appid('netmanagement/ospf', 2, wireshark='ospf') =
    protocol('ospf')
    : c++ {{
        if (size() < 4)
            return false;

        const uint8_t *data = begin();
        return (data[3] == size());
    }};
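
For comparison, the same length check in Python. This sketch assumes `size()` and `begin()` refer to the OSPF payload being examined. In the OSPF header the packet length is a 16-bit big-endian value in bytes 2-3, so comparing byte 3 alone, as the slide does, only agrees with the full field for packets shorter than 256 bytes; the second function is an added variant that reads both bytes and is not taken from the slide.

```python
def low_byte_length_check(payload):
    """Mirror the slide's check: byte 3 must equal the payload size."""
    if len(payload) < 4:
        return False
    return payload[3] == len(payload)


def full_length_check(payload):
    """Variant: read bytes 2-3 as a big-endian 16-bit length field."""
    if len(payload) < 4:
        return False
    return int.from_bytes(payload[2:4], "big") == len(payload)


# Hypothetical payloads: version 2, type 1, length field, zero padding.
small = bytes([2, 1, 0, 24]) + bytes(20)   # 24 bytes, length field 24
large = bytes([2, 1, 1, 44]) + bytes(296)  # 300 bytes, length field 300

print(low_byte_length_check(small), full_length_check(small))
print(low_byte_length_check(large), full_length_check(large))
```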

fingerprint('vpn/xxp_example') =
    'Next Protocol 250'
    : c++ {{
        packet_t pkt;
        int count = 0;

        while ((pkt = get_packet()) && count < 20) {
            ++count;

            if (pkt.size < 16)
                return false;

            if (pkt.data[4] != 0xCC ||
                pkt.data[5] != 0x45 ||
                pkt.data[15] != 0x72)
                return false;
        }

        return (count > 0);
    }};

Example -- code-based check on certain extracted files:

fingerprint('crazy/office') =
    extracted_file('doc' or 'xls' or 'ppt'
    : c++ {{
        return xks::filename().find("crazy") != std::string::npos;
    }});

Simplified regex-based metadata extraction

fingerprint('maps/google/example') =
    http_host('.google.') and url('/maps?')
    : c++
    extractors : {{
        ka = /Keep-Alive: (\d+)/;
        accept[] = /Accept-([^:]+): ([\w-]+)/;
    }}
    main : {{
        if (ka && ka[0] == "300") {
            for (size_t i = 0; i < accept.size(); ++i)
                if (accept[i][0] == "Encoding" && accept[i][1] == "gzip")
                    return true;
        }
        return false;
    }};
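
The extractor logic maps closely onto Python's `re` module. The sketch below assumes that `ka[0]` denotes the first capture group and that `accept[]` collects every match; the header text is a made-up sample.

```python
import re

# Hypothetical header block to run the extractors against.
HEADERS = (
    "Keep-Alive: 300\r\n"
    "Accept-Language: ar\r\n"
    "Accept-Encoding: gzip, deflate\r\n"
)

# Single-valued extractor: first match only.
ka = re.search(r"Keep-Alive: (\d+)", HEADERS)
# Multi-valued extractor: one (name, value) tuple per match.
accept = re.findall(r"Accept-([^:]+): ([\w-]+)", HEADERS)


def main(ka, accept):
    """Hit only when Keep-Alive is 300 and gzip encoding is accepted."""
    if ka and ka.group(1) == "300":
        for name, value in accept:
            if name == "Encoding" and value == "gzip":
                return True
    return False


print(accept, main(ka, accept))
```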

Support for flex-based pattern matching

fingerprint('maps/google/firefox') =
    http_host('.google.') and url('/maps?')
    : c++
    flex : {{
        USER_AGENT_CHAR [^\n\r]
        %%
        "User-Agent: "{USER_AGENT_CHAR}+ {
            std::string agent(yytext);
            if (agent.find("Firefox") != std::string::npos)
                return true;
        }
    }};

The next step: giving code-based appids (limited)
access to the XKS core

• Accessing top-level session metadata
• Throwing common events
• Contributing metadata for databasing

The goal: higher level of agility with lower
learning curve

Example: accessing session metadata

fingerprint('maps/google/server1') =
    http_host('.google.') and url('/maps?')
    : c++
    main : {{
        return (SESSION["to_ip"] == "123.45.67.1");
    }};

Example: throwing a document_metadata event

fingerprint('maps/google/contrived') =
    http_host('.google.') and url('/maps?')
    : c++
    main : {{
        xks::doc_meta_t dm;
        dm.filename = "google.txt";
        dm.author = "Google, Inc.";
        xks::document_metadata(dm);
    }};

Example: contributing metadata to HTTP Activity

fingerprint('maps/google/search') =
    http_host('.google.') and url('/maps?')
    : c++
    extractors : {{
        q = /[&?]q=([^&]+)/;
    }}
    main : {{
        if (q) {
            DB["http_parser"]["search_terms"] = xks::urldecode(q[0]);
            DB.apply();
            return true;
        }
    }};
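
The extraction and decoding step can be tried out in Python. This sketch assumes the decoder turns both `%XX` escapes and `+` into their plain characters, which may differ from what `xks::urldecode` does; `search_terms` is a hypothetical helper and the URL is invented.

```python
import re
from urllib.parse import unquote_plus


def search_terms(url):
    """Extract the q= parameter from a URL and decode it, or return None."""
    match = re.search(r"[&?]q=([^&]+)", url)
    if match is None:
        return None
    return unquote_plus(match.group(1))


print(search_terms("/maps?hl=en&q=coffee+near%20me"))
```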